The Biology of Veganism: Plasma Metabolomics Analysis Reveals Distinct Profiles of Vegans and Non-Vegetarians in the Adventist Health Study-2 Cohort

It is unclear how vegetarian dietary patterns influence plasma metabolites involved in biological processes regulating chronic diseases. We sought to identify plasma metabolic profiles distinguishing vegans (avoiding meat, eggs, dairy) from non-vegetarians (consuming ≥28 g/day red meat) of the Adventist Health Study-2 cohort using global metabolomics profiling with ultra-performance liquid chromatography mass spectrometry (UPLC-MS/MS). Differences in abundance of metabolites or biochemical subclasses were analyzed using linear regression models, adjusting for surrogate and confounding variables, with cross-validation to simulate results from an independent sample. Random forest was used as a learning tool for classification, and principal component analysis was used to identify clusters of related metabolites. Differences in covariate-adjusted metabolite abundance were identified in over 60% of metabolites (586/930), after adjustment for false discovery. The vast majority of differentially abundant metabolites or metabolite subclasses showed lower abundance in vegans, including xanthine, histidine, branched fatty acids, acetylated peptides, ceramides, and long-chain acylcarnitines, among others. Many of these metabolite subclasses have roles in insulin dysregulation, cardiometabolic phenotypes, and inflammation. Analysis of metabolic profiles in vegans and non-vegetarians revealed vast differences in these two dietary groups, reflecting differences in consumption of animal and plant products. These metabolites serve as biomarkers of food intake, many with potential pathophysiological consequences for cardiometabolic diseases.


Introduction
Findings from the Adventist Health Study-2 cohort have shown that vegetarian dietary patterns are associated with many positive health outcomes, including lower risk of metabolic syndrome (56%) [1], lower incidence of diabetes (39-62%) [2], lower overall mortality (hazard ratio (HR): 0.88, 95% confidence interval (CI): 0.80-0.97), and cardiovascular disease mortality (HR: 0.71, 95% CI: 0.57-0.90) for males [3], as well as lower risk of gastrointestinal cancers (HR: 0.76, 95% CI: 0.63-0.90) and cancer overall (HR: 0.92, processed meats, poultry, and fish consumption, among other items. Frequency categories varied with food type, and portion sizes included three levels: standard, 1/2 or less, and 1 1/2 or more. Nutrient intake data were collected with the use of the Nutrition Data System for Research software versions 4.06 and 5.03 (The Nutrition Coordinating Center, University of Minnesota, Minneapolis, MN, USA) [18]. Validity of dietary intake has been assessed extensively using 24-h diet recalls and biomarkers [17], and methods have been developed for dealing with measurement error [17,19,20]. Biennial health and hospitalization history forms (HHF) captured changes in exposures as well as lifetime dietary pattern trends for AHS-2 members [21]. The current study included vegans and non-vegetarians from the AHS-2 cohort who were previously recruited to participate in one of various substudies, where they were asked to attend a clinic and provide blood, urine, saliva, and/or adipose samples. These included a calibration study [17], the Biological Manifestations of Religion Substudy (BioMRS), which was nested within the Biology, Religion, and Health substudy [22], and other pilot studies (Supplementary Materials, Figure S1).
For the calibration and pilot studies, fasting blood samples were collected at field clinics held in church halls, as described previously, where healthy participants were selected randomly by church (Adventist churches were randomly selected from those within the US and Canada, weighted by church size) and then subject within church [23], while BioMRS participants were recruited to clinic sites in Loma Linda, Riverside, and Los Angeles, CA, USA [22]. Anthropometric data were also collected during clinic visits. For the current cross-sectional study, 96 subjects, including vegans and non-vegetarians were selected with stratified random sampling, balancing by sex and race, and excluding individuals with extreme body mass index (BMI) (<14 or >50) and total caloric intake of <500 or >4500 kcal/day. Vegans were defined as never or rarely (less than once per month) consuming eggs, dairy, fish, and other meats, based on responses to the FFQ. Non-vegetarians were defined as consuming non-fish meats at least once a month and any meat (including but not only fish) more than once per week. Vegans were of primary interest for this pilot study because of their complete avoidance of animal products and high consumption of fruits, vegetables, whole grains, legumes, and soy [24]. Non-vegetarians, who consume relatively low amounts of non-fish meats relative to the general population, were selected only if red meat consumption was ≥28 g/day (1 ounce) as an attempt to maximize the contrast between the two groups, although the majority (>70%) consumed more than 56 g/day (2 ounces). Viable plasma samples from a total of 46 non-vegetarians and 47 vegans were included in the current study for metabolomics profiling. The procedures followed were in accordance with the ethical standards of Loma Linda University and approved by the institutional review board for research involving human subjects.

Metabolomics Profiling
Ninety-three heparin plasma samples were analyzed by Metabolon Inc. (Durham, NC, USA). Samples were profiled using a Global Assay (DiscoveryHD4) (Metabolon Inc., Durham, NC, USA) that provides a comprehensive picture of metabolites within central carbon metabolism and various biochemical classes/pathways including amino acid, nucleotide, carbohydrate, lipid, xenobiotic, microbial, and others. The Metabolon platform yielded 1017 compounds of known identity.
Samples were prepared using the automated MicroLab STAR® system from Hamilton Company (Reno, NV, USA), as described previously [25]. After precipitation of proteins, the resulting extracts were analyzed on four independent ultra-high-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) platforms. A thorough description of the metabolomics platform and quality control procedures has been published previously [25]. Various controls were used in tandem with experimental samples, prior to injection into the mass spectrometer. A cocktail of QC standards was added to each sample to monitor instrument performance and aid in chromatographic alignment. Aliquots of a pooled plasma sample (obtained by taking a small amount of each experimental sample) were included as technical replicates. Extracted water samples served as process blanks.
Experimental samples were randomized across the platform run with QC samples spaced evenly among the injections. Biochemical compounds were identified by comparison to library entries of purified, authenticated standards or recurrent unknown entities. The median relative standard deviation (RSD) was calculated for the standards added to each sample as a measure of instrument variability, as well as for all endogenous metabolites within the pooled plasma samples (non-instrument standards) as a measure of overall process variability. The median RSD for internal standards and endogenous metabolites was 6% and 14%, respectively. Samples were measured in one batch and randomized by diet group.

Data Transformation and Linear Regression
Raw metabolite values were median scaled (divided by the median value), and missing metabolite values (below the detection threshold) were imputed with the minimum value for a given metabolite. Data were subsequently log transformed. Metabolites that were below the detection limit in >50% of vegans and non-vegetarians were excluded from the analysis, yielding an analytical set of 930 metabolites.
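The transformation steps described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was run in R). The function names, the use of `None` to mark values below detection, the natural logarithm, and the application of the 50% rule to the pooled sample rather than to each diet group are all assumptions made for illustration.

```python
import math
from statistics import median

def keep_metabolite(raw_values, max_missing=0.5):
    """Keep a metabolite unless more than max_missing of its values are missing.

    Assumption: the 50% rule is applied to the pooled sample.
    """
    missing = sum(v is None for v in raw_values)
    return missing / len(raw_values) <= max_missing

def transform_metabolite(raw_values):
    """Median-scale, minimum-impute, and log-transform one metabolite.

    raw_values: list of floats, with None marking values below detection.
    """
    observed = [v for v in raw_values if v is not None]
    med = median(observed)
    # Divide each observed value by the metabolite's own median.
    scaled = [v / med if v is not None else None for v in raw_values]
    # Impute missing values with the smallest observed (scaled) value.
    floor = min(s for s in scaled if s is not None)
    imputed = [floor if s is None else s for s in scaled]
    # Natural log is an assumption; the text only says "log transformed".
    return [math.log(s) for s in imputed]

# Toy values (illustrative only, not study data).
raw = [2.0, None, 8.0, 4.0]
logged = transform_metabolite(raw)
```

After median scaling, a value equal to the metabolite's median maps to a log abundance of zero, so each transformed value reads as a log multiple of the median.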
Linear regression models were generated to determine if individual plasma metabolites differed between vegans and non-vegetarians, using smart surrogate variable analysis (SmartSVA). With this approach, regression residuals, obtained from regression of log-transformed metabolite abundance (response) on diet group and covariates (age at blood collection, sex (male vs. female), race (Black vs. White), and BMI (continuous)), were used to obtain surrogate variables representing other unknown, unwanted sources of variation. A linear model was then fitted where the dependent and independent variables were the residuals obtained from regressing metabolite abundance and dietary pattern, respectively, on surrogate variables and the other covariates, excluding BMI, which was considered a mediating variable. Models including energy intake, which differed between vegans and non-vegetarians at baseline, did not show appreciable changes in results, so energy intake was not included in final models. The resulting univariate beta coefficients and adjusted predicted means with 95% confidence intervals were then obtained for each metabolite. The adjusted means estimate the log of the geometric means of the untransformed data, and the difference in these between vegans and non-vegetarians estimates the log fold change in metabolite abundance. Linear regression models were also generated without SmartSVA, regressing metabolite abundance on dietary pattern, adjusting only for age, sex, and race. An analysis of differential abundance of metabolites at the subclass level used the same approach, but operated on the average of the numerators of the individual metabolite t-statistics, i.e., the regression beta coefficients of the metabolites that contributed to that subclass. Composite t-statistics were produced by dividing by the standard deviations of the averaged numerators, taking account of the covariances between subclass members.
Each component metabolite had been measured as a multiple of its own median, then log-transformed. The differences between predicted means of subclass members in vegans compared to non-vegetarians were calculated. These estimates of the log ratio of composite geometric means of untransformed values were finally exponentiated to produce estimated fold changes. All statistical analyses were performed in R Statistical Software (version 4.0.2; R Core Team, Vienna, Austria).
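The residual-on-residual regression and the conversion of a log difference into a fold change can be sketched as follows. This is a simplified illustration under stated assumptions: a single covariate (age) stands in for the full set of covariates and surrogate variables, natural logs are assumed, and the helper names and toy data are hypothetical rather than taken from the study.

```python
import math

def ols_residuals(y, z):
    """Residuals from regressing y on an intercept and one covariate z."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    slope = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / szz
    return [(yi - my) - slope * (zi - mz) for yi, zi in zip(y, z)]

def adjusted_log_fold_change(log_abundance, diet, covariate):
    """Slope of residualized abundance on residualized diet group (0/1)."""
    ry = ols_residuals(log_abundance, covariate)
    rx = ols_residuals(diet, covariate)
    return sum(a * b for a, b in zip(rx, ry)) / sum(b * b for b in rx)

# Toy data (illustrative only): 1 = vegan, 0 = non-vegetarian.
diet = [1, 1, 1, 0, 0, 0]
age = [70.0, 65.0, 60.0, 66.0, 58.0, 55.0]
# Constructed so that vegans have exactly half the abundance at a given age.
log_abund = [0.5 + math.log(0.5) * d + 0.01 * a for d, a in zip(diet, age)]

beta = adjusted_log_fold_change(log_abund, diet, age)
fold_change = math.exp(beta)  # ratio of adjusted geometric means
```

Because the toy data are noise-free, the residualized slope recovers the built-in log fold change, and exponentiating it gives a vegan to non-vegetarian ratio of 0.5. Regressing residuals on residuals yields the same coefficient as including the covariate directly in a multiple regression.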

Adjustment for False Discovery
The partial t-statistics from the regressions described above correspond to differences in metabolite abundance according to dietary pattern. To control for multiple testing, an adapted Storey et al. [26] permutation approach was used. The residualized dietary pattern variables were permuted as a means of defining the null distribution of the t scores for metabolite abundance [27], thereby retaining covariances between residualized metabolite abundances. Estimating the proportion of null metabolites allows an estimate of the false discovery rate (FDR) avoiding the over-conservative Benjamini-Hochberg approach [26] and the consequent selection of metabolites with small FDR. An identical procedure was also used at the subclass level.
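A permutation-based FDR estimate of this general kind can be sketched in Python. The sketch assumes the observed t-statistics and the t-statistics from each permutation of the residualized diet variable have already been computed. The function names, the cutoff used to estimate the null proportion, and the toy numbers are hypothetical, and the adapted procedure used in the study may differ in detail.

```python
def estimate_pi0(observed_t, null_t_by_perm, small_cutoff):
    """Estimate the share of null metabolites from counts of small |t|.

    Compares how many observed statistics fall below small_cutoff with the
    average number that do so under permutation; the ratio is capped at 1.
    """
    obs_small = sum(abs(t) <= small_cutoff for t in observed_t)
    null_small = sum(
        sum(abs(t) <= small_cutoff for t in perm) for perm in null_t_by_perm
    ) / len(null_t_by_perm)
    if null_small == 0:
        return 1.0
    return min(1.0, obs_small / null_small)

def permutation_fdr(observed_t, null_t_by_perm, cutoff, pi0=1.0):
    """Estimated FDR when calling every metabolite with |t| >= cutoff."""
    called = sum(abs(t) >= cutoff for t in observed_t)
    if called == 0:
        return 0.0
    # Average number of permutation statistics that pass the same cutoff.
    expected_false = sum(
        sum(abs(t) >= cutoff for t in perm) for perm in null_t_by_perm
    ) / len(null_t_by_perm)
    return min(1.0, pi0 * expected_false / called)

# Toy statistics (illustrative only): six metabolites, two permutations.
observed = [5.0, 4.2, 3.9, 0.3, 0.8, 0.1]
nulls = [
    [0.2, 1.1, 0.5, 2.1, 0.7, 0.3],
    [0.9, 0.4, 3.95, 0.6, 0.1, 1.3],
]
pi0 = estimate_pi0(observed, nulls, small_cutoff=1.0)
fdr = permutation_fdr(observed, nulls, cutoff=3.0, pi0=pi0)
```

Setting `pi0` to 1 treats every metabolite as null when scaling the expected false calls, which gives a more conservative estimate. Estimating the null proportion from the data lowers the FDR when many metabolites are truly differential, as was the case here.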

Principal Components Analysis
Principal components analysis (PCA) was performed using the FactoMineR package in R on 930 log-transformed metabolites as a dimension reduction approach to identify principal components (PCs) or axes that maximize/explain the variation in metabolite abundance. This was done to test the hypothesis that a vegan or non-vegetarian diet could be defined by select groups of metabolites. A PCA plot was generated by obtaining individual scores (coordinates) of vegans and non-vegetarians for top PCs (PC1 and PC3) in 92 subjects, excluding one outlier. Ten PCs explained 50% of the variance. These were retained for regression analysis (eigenvalues >20), where associations of the 10 PCs with diet group and other dietary covariates (energy-adjusted) of interest were examined. Partial correlation coefficients between PCs and diet group or other variables of interest were also obtained (using ppcor package in R). For the PCA regression, variables with missing values (BMI, n = 2; Kcal, n = 1) were imputed with the mean value. Any missing dietary data was handled using multiple imputation with appropriate standard errors. Metabolites with loadings of 0.5 or higher (representing metabolite correlations with PCs) were identified for 10 components.

Random Forest Analysis
Random forest analysis was also used as a supervised classification approach to identify the most informative metabolites distinguishing vegans from non-vegetarians. All 930 log-transformed metabolites were considered in the analysis, with each of 50,000 trees learning from a random sample of fifty percent of all the data (23 vegans and 23 non-vegetarians) drawn without replacement, and the remaining data (the out-of-bag samples) passed down the tree for class prediction to calculate the out-of-bag (OOB) error. Mean decrease in accuracy was calculated by randomly permuting a variable (metabolite), and subsequently passing the data down the trees for re-assessment of class prediction. Hence, the most influential metabolites were determined after permuting each predictor variable and measuring the change or decrease in predictive accuracy [28].

Bootstrap Regression
Our bootstrap procedure chose 93 subjects with replacement, and then in each of 100 such samples performed the FDR analysis. At the metabolite level, 129 of the 930 metabolites were significant (FDR < 0.05) in at least 90% of all bootstrap samples (44 were significant in 100% of samples). At the subclass level, 18 subclasses were significant (FDR < 0.05) at least 90% of the time, and 9 were significant in 100% of the samples. The metabolites and subclasses are listed in the Supplementary Materials (Tables S16-S19).
While selection at FDR < 0.05 should mean that only about 5% of the selected metabolites are erroneously declared significant, it gives no information about truly differential metabolites that by chance had a t-score that just missed the FDR cut-point (Type II error). This was assessed by identifying metabolites significant in >50% of bootstrap samples. The bootstrap result provided added information by tending to identify metabolites that may be part of the 5% of false positives (they would show significance in few samples), while also identifying some metabolites that by chance missed significance in the parent sample, but achieved it in many of the bootstrap samples.
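The bootstrap bookkeeping described above can be sketched as follows. The callback `significant_fn`, which stands in for the full regression and FDR analysis on one resampled set of subjects, is a hypothetical placeholder, as are the toy metabolite names and the toy selection rule.

```python
import random

def bootstrap_selection_frequency(n_subjects, significant_fn, n_boot=100, seed=0):
    """Fraction of bootstrap samples in which each metabolite is selected.

    significant_fn takes a list of resampled subject indices and returns the
    names of metabolites significant in that sample (hypothetical callback).
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_boot):
        # Draw n_subjects indices with replacement.
        sample = [rng.randrange(n_subjects) for _ in range(n_subjects)]
        for name in significant_fn(sample):
            counts[name] = counts.get(name, 0) + 1
    return {name: c / n_boot for name, c in counts.items()}

def toy_significant(sample):
    """Toy rule: one metabolite always passes, another needs subject 0."""
    selected = {"always_selected"}
    if 0 in sample:
        selected.add("depends_on_subject_0")
    return selected

freq = bootstrap_selection_frequency(5, toy_significant, n_boot=100, seed=0)
stable = {name for name, f in freq.items() if f >= 0.9}
```

Metabolites selected in nearly every resample are robust to the particular subjects drawn, whereas those selected in few resamples are candidates for false positives in the parent sample.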

Cross-Validation
A separate consideration is that the regressions run to adjust for confounding are somewhat overoptimistic in the t-scores that they produce for the beta coefficients, as likelihoods to some extent are being maximized to also reflect random idiosyncrasies of our sample. This can be largely overcome by a cross-validation procedure, as follows: Randomly divide the participant sample into K parts. Excluding one part at a time, develop a regression model using subjects in the remaining K−1 parts, which is then used to generate predicted dependent values ($Y^{(p)}$) of all metabolites for subjects in the excluded part. In the end, all subjects receive such a set of $Y^{(p)}$ values. New improved estimates of the residual regression variances are then $\sum (Y - Y^{(p)})^2 / N$ for each metabolite, and these are always larger than those estimated from the total sample. Noting that $\mathrm{var}(\beta)$ for a particular metabolite in the full sample equals the residual variance divided by $N \cdot \mathrm{var}(X)$, a new, somewhat smaller t-score, $\beta / \sqrt{\mathrm{var}(\beta)}$, is calculated using the improved estimates of residual variances, and these smaller t-scores are submitted to the FDR procedure.
For this study, K = 10 partitions were used, and the increase in residual variances, and corresponding changes in t-scores was small. Very few metabolites lost significance as compared to the original results (see Supplementary Materials, Tables S14-S15).
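The cross-validated t-score can be sketched for a single metabolite and a single predictor. This is a simplified illustration: the study's models were residualized on several covariates and surrogate variables, and the function names, simulated data, and random seeds here are assumptions rather than the authors' implementation.

```python
import math
import random

def fit_line(x, y):
    """Ordinary least squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope

def t_score(x, beta, resid_var):
    """t = beta / sqrt(var(beta)), with var(beta) = resid_var / (N * var(X))."""
    n = len(x)
    mx = sum(x) / n
    var_x = sum((xi - mx) ** 2 for xi in x) / n
    return beta / math.sqrt(resid_var / (n * var_x))

def in_sample_t(x, y):
    """t-score using residuals from the model fitted to all subjects."""
    a, b = fit_line(x, y)
    resid_var = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)
    return t_score(x, b, resid_var)

def cross_validated_t(x, y, k=10, seed=0):
    """t-score using residuals predicted from models that excluded each fold."""
    n = len(x)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        sq_err += sum((y[i] - (a + b * x[i])) ** 2 for i in held_out)
    _, beta = fit_line(x, y)  # beta itself comes from the full sample
    return t_score(x, beta, sq_err / n)

# Simulated data (illustrative only): 10 subjects per diet group.
rng = random.Random(1)
x = [0.0] * 10 + [1.0] * 10
y = [xi + rng.gauss(0.0, 0.5) for xi in x]
t_full = in_sample_t(x, y)
t_cv = cross_validated_t(x, y, k=10, seed=0)
```

The held-out residuals are never smaller in total than the in-sample residuals of a least squares fit, so the cross-validated t-score is at most as large in magnitude as the in-sample one.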

Baseline Characteristics
Plasma metabolomics profiling was performed on 93 participants (47 vegans and 46 non-vegetarians) of the AHS-2 cohort. Roughly equal numbers of male and female and African American and Caucasian participants were included in this study and balanced among the two diet groups. Hence, there were no significant differences at baseline in race or sex comparing these two groups (Table 1). However, vegans were older than non-vegetarians (mean 66.5 vs. 60.8 years), and BMI was significantly higher in non-vegetarians (31.3 vs. 24.7), consistent with greater mean dietary kcals/day. Intakes of select foods or nutrients differed greatly by diet group, as non-vegetarians had significantly higher intakes of red meat, total meat, poultry, fish, dairy, and saturated fat (p < 0.001), and vegans had higher intakes of fiber, fruit, vegetables, soy, legumes (p < 0.001), and whole grains (p < 0.022). Significant differences in all lifestyle factors were seen when comparing vegans with non-vegetarians, most notably coffee drinking, where 39% of non-vegetarians consumed coffee (once or more per month) compared to 0% of vegans (p < 0.001). A significantly greater proportion of non-vegetarians also had a history of smoking (p = 0.008), alcohol drinking (p = 0.015), and use of aspirin or non-steroidal anti-inflammatory drugs (NSAIDs) (p = 0.038), while the number of minutes of exercise per week for vegans was more than twice that of non-vegetarians (132 vs. 64 min/wk; p = 0.003).

Linear Regression to Analyze Abundance of Individual Metabolites
Linear regression of individual metabolites with SmartSVA yielded a total of 586 differential metabolites after adjustment for false discovery. The top 40 metabolites present at higher abundance in vegans relative to non-vegetarians, ordered by fold change, are shown in Table 2. Fold changes of metabolites comparing vegans with non-vegetarians were derived from geometric mean ratios, based on adjusted means. Differences up to nearly 7-fold were observed. Several top metabolites with fold changes above 2-fold were lipids and xenobiotics. In addition, among metabolites showing greatest abundance in vegans were compounds within amino acid subclasses (including methionine/cysteine metabolism, urea cycle, tryptophan, and tyrosine metabolism), besides lipid subclasses including dicarboxylic acids and bile acid metabolism, among others (Table 2). However, the vast majority of metabolites differentially abundant at FDR < 0.05 (422/586 = 72%) were decreased in vegans. Metabolites showing the most marked decreases in vegans were again prominently xenobiotics, followed by lipids and amino acids (Table 3). This included metabolites from primarily xanthine, histidine, food component, dicarboxylic acid, and drug metabolism, where values were ~3- to ~25-fold higher in the non-vegetarians. Differentially abundant (positively and negatively associated) metabolites belonging to all major classes were identified (Supplementary Materials, Tables S1-S5).
When surrogate variables were not included in models, a smaller number of metabolites (346) were found to be differentially abundant in vegans relative to non-vegetarians (Supplementary Materials, Tables S6-S11). There was, however, considerable overlap between differentially abundant metabolites (FDR < 0.05) when comparing the regression approaches with and without SmartSVA, as the vast majority of differential metabolites (FDR < 0.05) detected in the regression excluding surrogate variables were also identified when SmartSVA was included in the analysis. Metabolites showing greatest fold changes were in strong agreement comparing the two approaches (Tables 2 and 3, Supplementary Materials Tables S6 and S7).

Linear Regression to Analyze Metabolite Subclasses
Regression analysis of metabolite subclasses with at least two component metabolites identified 50 differentially abundant subclasses at FDR < 0.05 (Table 4) of 93 total subclasses. Subclasses with increased abundance in vegans prominently included ketone bodies, followed by vitamin A metabolism, inositol, fatty acid acyl glycine metabolism, lactosylceramides, and benzoate metabolism subclasses. For each of these subclasses, all or the majority of the component metabolites significant at FDR < 0.05 were positively associated with a vegan dietary pattern. The directionality of the significant metabolites is notable considering the analysis included all metabolites in a subclass that were represented on the panel. Hence, differential abundance of subclasses, and directionality, were highly driven by component metabolites which had reached statistical significance in linear regression analysis.
Similar to findings of differential abundance of individual metabolites, the majority of subclasses showing statistical differences between vegans and non-vegetarians were negatively associated with vegans (Table 4). The subclasses with the greatest negative associations in vegans (>1.5-fold change) included xanthine metabolism, drug, branched fatty acid, histidine metabolism, acetylated peptides, ceramides, and dihydroceramides, where all or the vast majority of the statistically significant component metabolites were in decreased abundance. Additionally, all or the vast majority of statistically significant metabolites within many other negatively associated subclasses were also inversely associated with a vegan diet. These included subclasses representing long-chain acyl carnitine metabolism, long-chain saturated fatty acids, lysoplasmalogens, phenylalanines, long-chain monounsaturated fatty acids, monoacylglycerols, and leucine, isoleucine and valine, which are branched chain amino acids (BCAA), among other subclasses. Very similar results were obtained in the analysis excluding surrogate variables, where the top subclasses showing the greatest positive or negative associations with vegans were represented (Supplementary Materials, Table S12). Names of individual, statistically significant metabolites in each of these differentially abundant subclasses are listed in Supplementary Materials, Table S13.
Table 4 notes: 1 Linear regression analysis with SmartSVA based on composite t-statistics generated by dividing by the standard deviation of averaged numerators representing log-transformed metabolites. 2 Adapted Storey et al. [26] permutation approach used to adjust for false discovery. 3 Number of metabolites within subclass that were inversely associated with a vegan diet among those significantly differential in the linear regression analysis. 4 Number of metabolites within subclass that were positively associated with a vegan diet among those significantly differential in the linear regression analysis. 5 Total number of metabolites measured.

Cross-Validation and Bootstrapped Regression for Error Analysis
Linear regression with cross-validation was performed to identify possible metabolites or subclasses falsely rejected as null or non-null. Cross validation yielded results very similar to those obtained with the entire sample, with only 15 metabolites originally found to be differential at FDR < 0.05 not showing significance with cross-validation (type I error) (Supplementary Materials, Table S14), and no metabolites showing significance with cross-validation that were not identified in the analysis of the entire sample (type II error). All 50 subclasses found to be differential with analysis of the entire sample were also significantly differential with cross-validation, with potentially three additional subclasses that were not identified in analysis of the entire sample (Supplementary Materials, Table S15). Bootstrapped regression models also very much coincided with results of the non-bootstrapped regression analysis, with no more than three potentially new, non-null metabolites or subclasses identified (Supplementary Materials, Table S16). The majority of bootstrapped samples showed numbers of differential metabolites or metabolite subclasses comparable to those obtained in the non-bootstrapped analysis, and all metabolites or subclasses that were differential in at least 90% of bootstrapped samples were also differential in the non-bootstrapped regression analysis (Supplementary Materials, Tables S17-S19).

Random Forest Analysis for Classification by Diet Group
Random forest analysis was used to determine the ability of metabolites to identify the vegan and non-vegetarian dietary classes and to identify metabolites with the greatest predictive accuracy. The highest ranked metabolites with the ability to distinguish vegans from non-vegetarians (with greatest mean decrease in accuracy determined by the out-of-bag error/permutation method) included 3-bromo-5-chloro-2,6-dihydroxybenzoic acid, 3-methylhistidine, 1-methyl-5-imidazoleacetate, (14 or 15)-methyl palmitate (a17:0 or i17:0), N,N,N-trimethyl-5-aminovalerate, and sphingomyelin (d18:1/17:0, d17:1/18:0, d19:1/16:0) (Figure 1). Classification by diet group showed a predictive accuracy of 92.5%, with a misclassification (out-of-bag) error of 7.5%. These influential metabolites overlapped with metabolites determined to be highly statistically significant or to have large fold changes in the linear regression analysis.

Principal Component Analysis for Dimension Reduction
Principal component analysis was used to collapse data into orthogonal components and generate clusters of correlated metabolites. Nine hundred thirty metabolites were collapsed into 91 PCs explaining 100% of the variance, for 92 subjects (with exclusion of one outlier). Ten PCs with eigenvalues >20 explained 50% of the variance, with eight of these PCs containing metabolites with loadings >0.5. Metabolites driving PC1 included predominantly long-chain acylcarnitines and ceramides, followed by branched fatty acids and long-chain saturated fatty acids (Supplementary Materials, Figure S2 and Table S20). The most prominent subclasses represented in PC3 included long-chain polyunsaturated fatty acids and dicarboxylic acids, and there was representation of other types of fatty acids (monohydroxy fatty acids, long-chain monounsaturated fatty acids). Lysophospholipids were the predominant subclass represented in PC4 (Supplementary Materials, Table S20). The top four PCs accounted for nearly 1/3 of the variance, with PC1 and PC3 most clearly separating dietary groups (Figure 2).
Linear regression was performed to examine associations of each of the 10 PCs with various dietary variables of interest, and correlations were determined. Significantly associated PCs (p ≤ 0.05) are shown in Table 5. PC1, PC3, and PC4 were associated with a vegan diet (combined r = −0.5, p = 7.9 × 10^−7). These PCs were also associated with consumption of red meat, total meat, processed meat, and poultry, along with fiber and saturated fat, with absolute partial correlations ranging from 0.41 to 0.61. The first four PCs were highly correlated with consumption of fish (r = 0.52, p = 2.6 × 10^−7) and dairy kcals (r = 0.61, p = 6.4 × 10^−10). Correlations of red meat, total meat, and processed meat with these PCs were largely attenuated in expanded models including additional dietary covariates. Inclusion of poultry, fish, dairy, and fruit in the model attenuated associations between red meat or total meat and PCs, and an association remained only with PC1 for processed meat. After adjustment for saturated fat and fiber, significant associations remained for red meat and total meat with PC4 (β = 1.7, p = 0.009; β = 1.47, p = 0.008) (Supplementary Materials, Table S21).

Discussion
Among participants in the AHS-2 cohort, distinct metabolic profiles for vegans and non-vegetarians were discovered, with over 60% of metabolites being significantly discriminatory after adjustment for false discovery. Clearly, the serological characteristics of vegans and non-vegetarians differ substantially. The most notable metabolites more abundant in vegans belonged to categories mostly related to plant-food intakes. Those more abundant in non-vegetarians included subclasses of lipids and amino acids, which are mostly related to intakes of animal foods, besides xenobiotics reflecting other lifestyle behaviors such as caffeine consumption and medication use.
Prominent among metabolites more abundant in vegans were products of benzoate metabolism derived from polyphenols in plant foods, possibly also reflecting gut microbial activity. For example, the top compound, 4-ethylphenyl sulfate, is generated by the metabolism of soy protein by gut bacteria [29][30][31], and other metabolites may be generated through microbial metabolism of dietary polyphenols (hippurate metabolites, catechol sulfate, and others). Higher abundance of other food component/plant metabolites (glucopyranoside metabolites, stachydrine), besides vitamins, in vegans similarly reflects consumption of fruits, vegetables, and herbs [32,33], which have roles in reducing risk of chronic diseases. These findings in vegans are consistent with those obtained by Wu et al., where increases were found in benzoate metabolism products and polyphenolic plant compounds as well as gut microbial metabolites (hippurate, 4-ethylphenyl sulfate, 4-hydroxyhippurate, catechol sulfate, phenol sulfate). Notably, vegans showed a significant increase (30%) in butyrate, a short-chain fatty acid that is generated with increased fermentation of nondigestible carbohydrates or dietary fiber, and has a role in regulating inflammation and epithelial barrier function.
Other metabolites that were more abundant in vegan serum have relevance to lipid metabolism and metabolic homeostasis. The cause of the observed higher abundance of ketone bodies in vegans is not clear but may be related to the length of the overnight fast, possibly longer in vegans, caloric restriction, or exercise, which was more frequent in vegans. The acyl glycine subclass of lipids also showed higher abundance in vegans. Acyl glycines are metabolites of fatty acids with important roles in lipid signaling, some with anti-inflammatory ability [34]. There is evidence of negative regulation of acyl glycines by branched chain amino acids [35] through lowering of glycine. Hence, the lower abundance of branched chain amino acids in vegans in our study might explain in part the higher abundance of acyl glycines, and of primary glycine-conjugated bile acids (glycohyocholate, glycochenodeoxycholate, glyco-beta-muricholate).
The biological and pathophysiologic effects of these differences are several. Carotenoids, associated with vitamin A, and other polyphenols as well as microbial metabolites produced during breakdown of dietary fiber, have anti-inflammatory and antioxidant properties. These compounds counteract oxidative stress and support immune function and gut health, which may help prevent cancer, diabetes, cardiovascular and other diseases. This may happen through inhibition of nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB), regulation of inflammatory cytokines through epigenetic modifications, and increased transcription of antioxidant defense and xenobiotic detoxification genes [36][37][38]. Ketone bodies have beneficial roles in energy metabolism and glucose homeostasis, and thus may prevent or counteract inflammation and oxidative stress [39,40]. Glycine levels have also shown inverse associations with cardiometabolic disease phenotypes [41][42][43][44]. Higher abundance of glycine, along with glycine-conjugated bile acids, has been reported in other cross-sectional studies of vegans and vegetarians [13,45-48].
The vast majority of metabolites showed lower abundance in vegan serum most often reflecting the absence of animal products in the diet, or the much lower intakes of caffeine and use of medications. Many metabolites within the xanthine metabolism subclass, significantly different in vegans and non-vegetarians, are metabolites of caffeinetheophylline, paraxanthine, 5-acetylamino-6-amino-3-methyluracil, and several others, and would reflect the greater coffee consumption (or perhaps medication use) in nonvegetarians. Other xenobiotic metabolites present at significantly lower abundance in vegans included chemical and drug metabolites, which can be explained by the increased use of acetaminophen and other NSAIDs in non-vegetarians. The large (3-fold) decrease in the acetylated peptides (phenylacetylglutamine, phenylacetyl carnitine) in vegans is likely a consequence of metabolism of phenylalanine (converted to phenylacetate), along with glutamine or L-carnitine, which may increase with higher intake of animal meats, protein, or certain pharmaceutics (NSAIDs). More notable was the high abundance of a large number of histidine metabolites and branched chain amino acids-isoleucine, leucine, and valine-in non-vegetarians, likely reflecting dietary consumption of meat and animal products [49][50][51]. 1-and 3-methylhistidine, for example, are biomarkers of skeletal muscle protein breakdown [51,52]. Histidine and other imidazole-containing compounds may also be related to use of pharmaceuticals [49].
The markedly lower abundance of various lipid metabolites in vegans, such as ceramides and dihydroceramides, long-chain acylcarnitines, long-chain saturated fatty acids, monoacylglycerols, and branched chain fatty acids, may reflect reduced intake of saturated fats and lipids/sphingolipids derived from animal sources in the diet. Only lactosylceramides were increased in vegans, possibly reflecting consumption of the precursor glucosylceramide from dietary plant sources (e.g., soy, wheat) [52].
Our findings are largely consistent with previous studies examining metabolic profiles associated with vegetarian, vegan, and plant-based diets. Decreases in sphingolipids and some acylcarnitines have been observed in vegans relative to omnivores in a cross-sectional study of individuals in the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort following habitual dietary patterns [13]. In another study of individuals following a vegan diet for at least six months, decreases in phospholipids, saturated fatty acids, 3-carboxy-4-methyl-5-propyl-2-furanpropanoic acid (CMPF), and methylhistidine, but increases in plant-derived and microbial metabolites, were observed [14]. Further, the Mediterranean dietary pattern has been associated with alterations in acylcarnitines and phospholipids [10]. Additionally, ceramides, plasmalogens, and acylcarnitines were positively associated with a Western dietary pattern in a cross-sectional study examining associations of plasma metabolites with Western or Prudent dietary patterns in the Women's Health Initiative cohort [8].
There are important biological and pathophysiologic implications of these differences. While histidine may have some beneficial health effects (e.g., antioxidant activity), histidine metabolites and related imidazole derivatives may be associated with impaired insulin signaling, type 2 diabetes, and kidney disease [53,54]. Several cross-sectional studies have shown positive associations of branched chain amino acids with cardiovascular disease risk, metabolic dysregulation, impaired glucose signaling and insulin resistance [55][56][57], and a number of studies have shown positive associations with type 2 diabetes [58][59][60]. Ceramides are critical lipid signaling molecules with varied and complex biological roles. Both ceramides and sphingolipids may play a role in insulin resistance [61][62][63]. Ceramides have been found to accumulate in tissues of obese individuals, have been associated with inflammation [64], and also have roles in the regulation of apoptosis and development of neurological disorders [65][66][67]. Lactosylceramides may have a role in promoting innate immunity [68], besides other biological functions. Phenylacetylglutamine, among others of these metabolites, may reflect activity of gut microbiota, and has been associated with cardiovascular disease risk [69][70][71]. Acylcarnitines are formed in the mitochondria during beta oxidation of fatty acids, but increases in plasma may reflect metabolic disorders [72], as these compounds have been associated with insulin resistance, diabetes, and increased cardiovascular disease risk. Similarly, long-chain saturated fatty acids are implicated in increased risk of obesity and cardiometabolic diseases [73]. Monoacylglycerols are converted to triacylglycerols, which are associated with heart disease, obesity, and metabolic syndrome [74].
Branched chain fatty acids, on the other hand, which are derived largely from dairy and meat products (though also synthesized by gut bacteria from branched chain amino acids in herbivores [75]), may favorably influence metabolic health. Accordingly, they have potentially beneficial effects on insulin sensitivity and weight management, and may attenuate inflammation [76][77][78], but further human studies are needed.
Vegetarians and particularly vegans have notably higher consumption of fruits, vegetables, legumes, whole grains, soy foods, and nuts, and markedly reduced or absent intake of animal products, as demonstrated previously in the AHS-2 cohort [25]. Additionally, compared to non-vegetarians, they have more favorable fatty acid profiles, including lower saturated fatty acids, and higher total omega-3, along with higher levels of phytochemicals such as carotenoids, enterolactone, and isoflavones in plasma or urine [15], but lower inflammatory cytokines [79]. Importantly, vegetarians have shown significantly reduced risks of diabetes, hypertension, cardiovascular disease, select cancers, and all-cause mortality relative to non-vegetarians [1][2][3][80]. These differences in disease and biomarker profiles between vegetarians and non-vegetarians coincide with the differences in plasma metabolites identified in the present study.
Strengths of the current study include well-defined diet groups (vegans and non-vegetarians) reflecting habitual dietary patterns [21], and the inclusion of higher meat consuming non-vegetarians. An additional strength is the analytical rigor applied with the use of multiple regression and other approaches, including adjustment for surrogate variables to address additional confounding and unwanted variation, in addition to classification and dimensionality reduction approaches. Further, adjustment for false discovery with use of the adapted Storey et al. [26] method allowed for an accurate estimation of the proportion of t-statistics from a non-null distribution, which provides an improvement in power compared to other approaches. AHS-2 participants have provided extensive dietary data, as well as demographic and medical data, strengthening the analysis of the current study. Limitations of the study are the single measurement of plasma metabolites, and the somewhat limited sample size, although there was sufficient power to detect a large number of statistically significant differences between vegans and non-vegetarians. Estimates of results from an independent sample were obtained by cross-validation. There are notable lifestyle differences in this Adventist cohort compared to other vegan and non-vegetarian individuals. Seventh-day Adventists place an emphasis on health and wellness, and there are religious guidelines prohibiting certain lifestyle behaviors (smoking, alcohol drinking, consumption of biblically unclean meats), but no prohibitions for clean meats in general or dairy. These healthy practices, which limit confounding by these factors, may also limit generalizability of results. The overall reduced consumption of non-fish meats, and particularly red meats, among AHS-2 non-vegetarians relative to the general population was a limitation, although this was partially overcome by selection of subjects with higher meat consumption.
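To illustrate the general idea behind Storey-type false discovery adjustment mentioned above, the following is a minimal sketch of the standard q-value procedure applied to p-values. It is not the adapted method used in this study: the function name `storey_qvalues`, the fixed tuning parameter `lam = 0.5`, and the example p-values are all assumptions made for illustration. The key step is estimating the proportion of true null hypotheses (pi0) from the p-values above `lam`; when pi0 is below 1, the resulting q-values are smaller than Benjamini-Hochberg adjusted p-values, which is the source of the gain in power.

```python
import numpy as np

def storey_qvalues(pvalues, lam=0.5):
    """Sketch of Storey-type q-values with a fixed lambda (illustrative only)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    # Estimated proportion of true nulls: p-values above lam are assumed
    # to come mostly from the (uniform) null distribution.
    pi0 = min(1.0, np.mean(p > lam) / (1.0 - lam))
    order = np.argsort(p)
    ranked = p[order]
    # Raw q-value for the i-th smallest p-value: pi0 * m * p_(i) / i
    q = pi0 * m * ranked / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value
    q = np.minimum.accumulate(q[::-1])[::-1]
    q = np.minimum(q, 1.0)
    # Return q-values in the original order of the input
    out = np.empty(m)
    out[order] = q
    return pi0, out

# Hypothetical p-values for five metabolites (not study data)
example_p = [0.001, 0.6, 0.002, 0.9, 0.003]
pi0_hat, qvals = storey_qvalues(example_p)
significant = qvals < 0.05  # flags metabolites at an FDR threshold of 0.05
```

In this sketch, `1 - pi0_hat` plays the role of the estimated proportion of test statistics drawn from a non-null distribution. In practice, `lam` is usually chosen by smoothing or bootstrapping over a grid rather than fixed in advance.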
As many diet-derived metabolites are converted by gut bacteria, it remains to be understood how differences in these metabolites might be linked with alterations in composition of the gut microbiome.

Conclusions
In this study, we report marked differences in metabolic profiles between vegans and non-vegetarians. Our results suggest that multiple potentially bioactive metabolites are increased by consumption of plant-based foods, and may lower the risk of metabolic diseases through anti-inflammatory mechanisms. On the other hand, diets high in animal products may lead to increases in various amino acids and lipid species (acylcarnitines, saturated fatty acids, ceramides, branched chain amino acids) that may promote chronic diseases by increasing inflammation and insulin dysregulation, thereby disrupting metabolic homeostasis. The exact roles or physiological functions of other differentially abundant metabolites are not clear. It may be that some differentially abundant metabolites in vegans and non-vegetarians serve only as markers of different foods or eating patterns, while others also have important pathophysiological consequences. This study helps lay the foundation for a deeper understanding of the relationship of diet-associated metabolites to the pathophysiology of chronic diseases.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/nu14030709/s1; Table S1: Amino acid metabolites associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression models with SmartSVA approach; Table S2: Lipid metabolites associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression models with SmartSVA approach; Table S3: Carbohydrate, cofactor/vitamin, and energy metabolites associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression models with SmartSVA approach; Table S4: Nucleotides, partially characterized molecules, and peptides associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression models with SmartSVA approach; Table S5: Xenobiotic metabolites associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression models with SmartSVA approach; Table S6: Top 40 metabolites positively associated with a vegan dietary pattern at FDR < 0.05 in linear regression analysis without SmartSVA; Table S7: Top 40 metabolites inversely associated with a vegan dietary pattern at FDR < 0.05 in linear regression analysis without SmartSVA; Table S8: Amino acid metabolites associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression analysis without SmartSVA; Table S9: Lipid metabolites associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression analysis without SmartSVA; Table S10: Carbohydrate, cofactors/vitamins, energy, nucleotide, partially characterized, and peptide metabolites associated with a vegan dietary pattern in linear regression analysis without SmartSVA; Table S11: Xenobiotic metabolites associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05 in linear regression analysis without SmartSVA; Table S12: Metabolite subclasses associated with diet group (vegan vs. non-vegetarian) at FDR < 0.05 without SmartSVA; Table S13: Component metabolites of each subclass associated with a vegan (relative to non-vegetarian) dietary pattern at FDR < 0.05; Table S14: Metabolites differentially abundant (FDR < 0.05) in linear regression analysis with full sample that were non-differential post cross-validation; Table S15: Subclasses differential after cross-validation that were not differential in regression analysis with entire sample; Table S16: Metabolites or metabolite subclasses showing differential abundance between vegans and non-vegetarians (FDR < 0.05) in > 50% of bootstrapped linear regression analyses, but not differential in non-bootstrapped regression analysis; Table S17: Numbers of differential metabolites or metabolite subclasses (at FDR < 0.05) in regression analysis with bootstrap sampling; Table S18: List of 129 metabolites showing differential abundance (FDR < 0.05) in at least 90% of bootstrap regressions; Table S19: Metabolite subclasses showing differential abundance (FDR < 0.05) in at least 90% of bootstrap regressions; Table S20: Top components from principal components analysis and most influential metabolites; Table S21: Adjusted linear regression predicting red meat, processed, and total meat consumption from top principal components from principal components analysis, with adjustment for additional potential dietary confounders; Figure S1: Study design for individuals in the Metabolomics Pilot Study. Footnotes: (a) The Calibration sub-study was a random sample of the cohort, except for an overweighting of Black subjects so they formed 40% of the total. (b) The two pilot sub-studies were convenience samples of study subjects living in Texas (Black subjects) or Washington State designed to test our bio-sample acquisition strategies. (c) The BioMRS sub-study was a local sample of AHS-2 subjects who had responded to a request to complete a questionnaire containing psychosocial and religiosity questions. They lived within 50 miles of Loma Linda, Riverside, or downtown Los Angeles and were at least 50 years of age; Figure S2: Subclass loadings from first principal component (PC1). Principal component analysis was used to identify components explaining variation in metabolites comparing vegans and non-vegetarians. Metabolite loadings > 0.5 were extracted and averaged across each represented subclass.

Funding: This study was funded by Loma Linda University Health through a pilot grant given to PDH in support of AHS-2 research (Basic Science and Translational Research Pilot Funds). Additionally, the research was supported by the Ardmore Institute of Health ("The Adventist Health Study: The next generation in transformational health knowledge"), and NIH National Institute on Minority Health and Health Disparities K01MD015194. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Institutional Review Board Statement:
The current study was conducted in accordance with the Declaration of Helsinki, and the protocol was reviewed by the Ethics Committee of Loma Linda University Health but determined to be exempt as outlined in federal regulations for protection of human subjects, 45 CFR Part 46.101(b)(4) XXX (5190502), as the study utilized biospecimens collected in a previously approved study, protocol #48134 on 5/28/03.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study. All subjects gave their informed consent for inclusion before they participated in the original study approved by the Loma Linda University institutional review board (protocol #48134), in which plasma samples were obtained for the current study.
Data Availability Statement: Data supporting the results of this study will be available upon request, once reviewed and approved by the authors.

Conflicts of Interest:
The authors declare no conflict of interest.